Fergus Finn

mentions: 1 · type: Person · feed: RSS

// recent coverage: 1 mention

00:00

2026-06-22

fergusfinn.com

large-language-models

Adaptive speculative decoding: picking draft lengths at runtime

Researchers have developed adaptive speculative decoding, a method that dynamically selects draft lengths at runtime to optimize token generation efficiency in large language models. The approach addr…

// co-occurs with top 5 entities

Qwen (1), DeepSeek (1), GatedDeltaNet (1), DFlash (1), Kimbell Art Museum (1)